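Advances in automated microscopy and quantitative image analysis have promoted high-content screening (HCS) as an efficient tool for drug discovery and research. While HCS makes it possible to quantify complex cellular phenotypes from images at high throughput, the process can be obstructed by image aberrations such as out-of-focus blur, fluorophore saturation, debris, high levels of noise, unexpected autofluorescence, or empty images. Although this issue has received only moderate attention in the literature, overlooking these artefacts can seriously hamper downstream image processing tasks and hinder the detection of subtle phenotypes. Quality control is therefore both a primary concern and a prerequisite in HCS. In this work, we evaluate deep learning options that do not require extensive image annotation, in order to provide a straightforward and easy-to-use semi-supervised learning solution to this problem. Specifically, we compare the efficacy of recent self-supervised and transfer learning approaches in providing the base encoder for a high-throughput artefact image detector. The results of this study suggest that transfer learning methods should be preferred for this task, as they not only performed best here but also have the advantage of requiring neither sensitive hyperparameter settings nor extensive additional training.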
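Image manipulation and forgery detection have been a topic of research for more than a decade. New-age tools and large-scale social platforms have given manipulated media room to thrive. Such media can be dangerous, and countless methods have therefore been designed and tested to demonstrate their robustness in detecting forgeries. However, the results reported by state-of-the-art systems indicate that supervised approaches achieve near-perfect performance only on particular datasets. In this work, we analyze the out-of-distribution generalisability of current state-of-the-art image forgery detection techniques through several experiments. Our study focuses on models that use handcrafted features for image forgery detection. We show that the developed methods fail to perform well in cross-dataset evaluations and on in-the-wild manipulated media. This raises a question about the current evaluation, and the overestimated performance, of the systems under consideration. Note: This work was done during a summer research internship at ITMR Lab, IIIT-Allahabad, under Prof. Anupam Agarwal.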
Effective management of shared public spaces, such as car parking, is a challenging aspect of urban transformation for many cities, especially in the developing world. By leveraging sensing technologies, cloud computing, and artificial intelligence, cities are increasingly being managed smartly. Smart cities not only bring convenience to city dwellers but also improve their quality of life, as advocated by the United Nations in the 2030 Sustainable Development Goal on Sustainable Cities and Communities. Through the integration of the Internet of Things and cloud computing, this paper presents a successful proof-of-concept implementation of a framework for managing public car parking spaces. Parking slots are reserved through a cloud-hosted application, while entry to and exit from a parking slot are enabled through Radio Frequency Identification (RFID) technology, which triggers a real-time update of slot availability in the cloud-hosted database. This framework could bring considerable convenience to city dwellers, since motorists need only drive to a parking space when they are sure of a vacant slot, an important stride towards the realization of sustainable smart cities and communities.
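The reserve, enter, and exit flow described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `ParkingLot`, the slot states, and the in-memory dictionaries that stand in for the cloud-hosted database. None of it is taken from the paper's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ParkingLot:
    """Hypothetical in-memory stand-in for the cloud-hosted slot database."""
    slots: dict = field(default_factory=dict)         # slot id -> "free" | "reserved" | "occupied"
    reservations: dict = field(default_factory=dict)  # RFID tag id -> slot id

    def reserve(self, tag_id):
        """Reserve the first free slot for a tag; return the slot id, or None if full."""
        for slot, state in self.slots.items():
            if state == "free":
                self.slots[slot] = "reserved"
                self.reservations[tag_id] = slot
                return slot
        return None

    def rfid_entry(self, tag_id):
        """Called when a tag is read at the entrance; admits only tags with a reservation."""
        slot = self.reservations.get(tag_id)
        if slot is None or self.slots[slot] != "reserved":
            return False
        self.slots[slot] = "occupied"
        return True

    def rfid_exit(self, tag_id):
        """Called when a tag is read at the exit; frees the slot immediately."""
        slot = self.reservations.get(tag_id)
        if slot is None or self.slots[slot] != "occupied":
            return False
        self.slots[slot] = "free"
        del self.reservations[tag_id]
        return True

    def vacant(self):
        """Slots a motorist could still reserve."""
        return [s for s, state in self.slots.items() if state == "free"]
```

A motorist would call `reserve` before driving, and the RFID reader events would then drive `rfid_entry` and `rfid_exit`, so `vacant()` always reflects the current availability.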
Driving on pothole-infested roads is hazardous to life and economically costly. The experience is even worse for motorists using a pothole-filled road for the first time. Pothole-filled road networks have been associated with severe traffic jams, especially at peak times of the day. Besides wasting fuel and time, traffic jams often lead to increased carbon emissions and noise pollution. Moreover, the risk of fatal accidents has been strongly associated with potholes, among other road network factors. Discovering potholes before using a particular road is therefore of significant importance. This work presents a successful demonstration of a sensor-based pothole mapping agent that captures both a pothole's depth and its location coordinates, parameters that are then used to generate a pothole map for the agent's entire journey. The map can then be shared with all motorists intending to use the same route.
We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans under arbitrary viewpoints, body poses, and lighting. We only require a short video clip of the person to create the avatar and assume no knowledge about the lighting environment. We present a novel framework to model humans while disentangling their geometry, texture, and also lighting environment from monocular RGB videos. To simplify this otherwise ill-posed task we first estimate the coarse geometry and texture of the person via SMPL+D model fitting and then learn an articulated neural representation for photorealistic image generation. RANA first generates the normal and albedo maps of the person in any given target body pose and then uses spherical harmonics lighting to generate the shaded image in the target lighting environment. We also propose to pretrain RANA using synthetic images and demonstrate that it leads to better disentanglement between geometry and texture while also improving robustness to novel body poses. Finally, we also present a new photorealistic synthetic dataset, Relighting Humans, to quantitatively evaluate the performance of the proposed approach.
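As a rough illustration of the shading step, the sketch below applies second-order spherical harmonics (SH) lighting to one pixel, given its albedo and surface normal. It uses the standard nine real SH basis functions. The function names, the per-pixel layout, and the clamping of negative irradiance are assumptions made here; the abstract does not give RANA's exact formulation.

```python
def sh_basis(normal):
    """Evaluate the nine real SH basis functions (bands 0 to 2) at a unit normal (x, y, z)."""
    x, y, z = normal
    return [
        0.282095,                        # band 0 (constant term)
        0.488603 * y,                    # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]


def shade_pixel(albedo, normal, sh_coeffs):
    """Shade one RGB pixel: albedo times SH irradiance.

    albedo:    (r, g, b) reflectance
    normal:    unit surface normal (x, y, z)
    sh_coeffs: nine (r, g, b) lighting coefficients, one per basis function
    """
    basis = sh_basis(normal)
    shaded = []
    for c in range(3):
        irradiance = sum(b * coeff[c] for b, coeff in zip(basis, sh_coeffs))
        # Clamping negative irradiance to zero is an assumption of this sketch.
        shaded.append(albedo[c] * max(irradiance, 0.0))
    return tuple(shaded)
```

Relighting in this formulation amounts to swapping `sh_coeffs` while keeping the predicted albedo and normals fixed.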
Denoising diffusion models hold great promise for generating diverse and realistic human motions. However, existing motion diffusion models largely disregard the laws of physics in the diffusion process and often generate physically-implausible motions with pronounced artifacts such as floating, foot sliding, and ground penetration. This seriously impacts the quality of generated motions and limits their real-world application. To address this issue, we present a novel physics-guided motion diffusion model (PhysDiff), which incorporates physical constraints into the diffusion process. Specifically, we propose a physics-based motion projection module that uses motion imitation in a physics simulator to project the denoised motion of a diffusion step to a physically-plausible motion. The projected motion is further used in the next diffusion step to guide the denoising diffusion process. Intuitively, the use of physics in our model iteratively pulls the motion toward a physically-plausible space. Experiments on large-scale human motion datasets show that our approach achieves state-of-the-art motion quality and improves physical plausibility drastically (>78% for all datasets).
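The alternation between denoising and physics-based projection can be sketched as a toy loop. This is not PhysDiff itself: the paper's projection runs motion imitation in a physics simulator, whereas the hypothetical `clamp_to_ground` below only removes ground penetration from a list of foot heights, and `toy_denoise` is a placeholder for a learned denoiser. Projecting at every step is also an assumption of this sketch.

```python
def physics_guided_sampling(x_T, denoise_step, project, num_steps):
    """Run num_steps denoising steps, projecting after each one.

    The projected motion, not the raw denoised one, is fed to the next step,
    so each iteration pulls the sample toward the plausible set.
    """
    x = x_T
    for t in range(num_steps, 0, -1):
        x = denoise_step(x, t)   # ordinary diffusion denoising step
        x = project(x)           # physics-based projection of the denoised motion
    return x


def toy_denoise(motion, t):
    """Hypothetical stand-in for a learned denoiser: shrink every value by half."""
    return [0.5 * h for h in motion]


def clamp_to_ground(motion):
    """Hypothetical stand-in for the simulator: forbid heights below the ground plane."""
    return [max(h, 0.0) for h in motion]
```

For example, `physics_guided_sampling([-1.0, 2.0, -0.25], toy_denoise, clamp_to_ground, 3)` never returns a negative foot height, because each intermediate sample is projected before the next denoising step.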
We introduce an information-maximization approach for the Generalized Category Discovery (GCD) problem. Specifically, we explore a parametric family of loss functions that evaluate the mutual information between the features and the labels, and automatically find the one that maximizes predictive performance. Furthermore, we introduce the Elbow Maximum Centroid-Shift (EMaCS) technique, which estimates the number of classes in the unlabeled set. We report comprehensive experiments, which show that our mutual information-based approach (MIB) is both versatile and highly competitive under various GCD scenarios. The gap between the proposed approach and existing methods is significant, more so when dealing with fine-grained classification problems. Our code: https://github.com/fchiaroni/Mutual-Information-Based-GCD.
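To make the mutual-information objective concrete, the sketch below estimates I(X; Y) = H(Y) - H(Y|X) from a batch of predicted class probabilities. The weight `lam` on the marginal entropy is a hypothetical way of producing a one-parameter family of objectives; the abstract does not state the paper's actual parameterization, so treat this as an assumption.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution, with 0 * log 0 taken as 0."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(probs, lam=1.0):
    """Estimate lam * H(Y) - H(Y|X) from per-sample class probabilities.

    probs: list of N rows, each a length-K probability vector (e.g. softmax outputs).
    lam:   hypothetical weight on the marginal entropy; lam = 1 gives the plain estimate.
    """
    n, k = len(probs), len(probs[0])
    # Marginal label distribution: average prediction over the batch.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    # Conditional entropy: average entropy of the individual predictions.
    conditional = sum(entropy(row) for row in probs) / n
    return lam * entropy(marginal) - conditional
```

The estimate is largest when predictions are individually confident (low H(Y|X)) and spread evenly across classes (high H(Y)), and it drops to zero when every prediction is uniform.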
Machine learning is the study of computer algorithms that improve automatically through data and experience. Machine learning algorithms build a model from sample data, called training data, to make predictions or judgments without being explicitly programmed to do so. A variety of well-known machine learning algorithms have been developed in computer science to analyze data. This paper introduces a new machine learning algorithm called impact learning. Impact learning is a supervised learning algorithm that can be applied to both classification and regression problems. It also shows its strength in analyzing competitive data: the algorithm learns from competitive situations, where the competition arises from the effects of autonomous features. It is built on the impacts of the features, derived from the intrinsic rate of natural increase (RNI). We also demonstrate the superiority of impact learning over conventional machine learning algorithms.
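Unpaired image denoising has seen promising development over the past few years. Regardless of their performance, existing methods tend to rely heavily on underlying noise properties or on assumptions that are not always practical. Alternatively, if we approach the problem from a structural perspective rather than from noise statistics, we can achieve a more robust solution. With this motivation, we propose a self-supervised denoising scheme that is unpaired and relies on spatial degradation followed by a regularized refinement. Our method shows considerable improvement over previous methods and exhibits consistent performance across different data domains.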
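Class activation maps (CAMs) help to formulate saliency maps that aid in interpreting the predictions of deep neural networks. Gradient-based methods are generally faster than other branches of vision interpretability and are independent of human guidance. The performance of CAM-like studies depends on the layer response of the governing model and on the influence of the gradients. Typical gradient-oriented CAM studies rely on weighted aggregation for saliency map estimation, projecting each gradient map into a single weight value, which may lead to over-generalized saliency maps. To address this issue, we use a global guidance map to rectify the weighted aggregation operation during saliency estimation, so that the resulting interpretations are comparatively cleaner and instance-specific. We obtain the global guidance map by performing element-wise multiplication between the feature maps and their corresponding gradient maps. To validate our study, we compare the proposed scheme with eight different saliency visualizers. In addition, we use seven commonly used evaluation metrics for quantitative comparison. The proposed scheme achieves significant improvements on test images from the ImageNet, MS-COCO 14, and PASCAL VOC 2012 datasets.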
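The two ingredients named in this abstract can be sketched side by side: the usual gradient-weighted aggregation (Grad-CAM style, where each gradient map is collapsed to one weight) and the element-wise product of feature maps with their gradient maps. How the paper combines the guidance map with the aggregation is not stated in the abstract, so the sketch deliberately stops short of that step. The function names and the nested-list layout (channels, then rows, then columns) are assumptions.

```python
def weighted_aggregation(features, grads):
    """Grad-CAM style saliency: one weight per channel, then ReLU of the weighted sum.

    features, grads: K channels, each an H x W nested list of floats.
    """
    k, h, w = len(features), len(features[0]), len(features[0][0])
    # Project each gradient map to a single weight by global average pooling.
    weights = [sum(sum(row) for row in g) / (h * w) for g in grads]
    return [
        [max(sum(weights[c] * features[c][i][j] for c in range(k)), 0.0)
         for j in range(w)]
        for i in range(h)
    ]


def guidance_map(features, grads):
    """Element-wise product of each feature map with its gradient map (per channel)."""
    return [
        [[a * g for a, g in zip(f_row, g_row)] for f_row, g_row in zip(f, g)]
        for f, g in zip(features, grads)
    ]
```

The contrast is visible in the shapes: `weighted_aggregation` discards the spatial layout of each gradient map when forming its weights, while `guidance_map` keeps a value per location, which is what makes it instance-specific.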